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ABSTRACT 

Model-based  prognostics  captures  system  knowledge  in 
the  form  of  physics-based  models  of  components,  and 
how  they  fail,  in  order  to  obtain  accurate  predictions  of 
end  of  life  (EOL).  EOL  is  predicted  based  on  the  esti¬ 
mated  current  state  distribution  of  a  component  and  ex¬ 
pected  profiles  of  future  usage.  In  general,  this  requires 
simulations  of  the  component  using  the  underlying  mod¬ 
els.  In  this  paper,  we  develop  a  simulation-based  pre¬ 
diction  methodology  that  achieves  computational  effi¬ 
ciency  by  performing  only  the  minimal  number  of  sim¬ 
ulations  needed  in  order  to  accurately  approximate  the 
mean  and  variance  of  the  complete  EOL  distribution. 
This  is  performed  through  the  use  of  the  unscented  trans¬ 
form,  which  predicts  the  means  and  covariances  of  a 
distribution  passed  through  a  nonlinear  transformation. 
In  this  case,  the  EOL  simulation  acts  as  that  nonlinear 
transformation.  In  this  paper,  we  review  the  unscented 
transform,  and  describe  how  this  concept  is  applied  to 
efficient  EOL  prediction.  As  a  case  study,  we  develop 
a  physics-based  model  of  a  solenoid  valve,  and  perform 
simulation  experiments  to  demonstrate  improved  com¬ 
putational  efficiency  without  sacrificing  prediction  accu¬ 
racy. 

1.  INTRODUCTION 

Prognostics  is  an  essential  technology  for  improving  sys¬ 
tem  safety,  reliability,  and  availability.  Prognostics  deals 
with  determining  the  health  state  of  components,  and 
projecting  the  evolution  of  the  health  into  the  future  to 
make  end  of  life  (EOL)  and  remaining  useful  life  (RUL) 
predictions.  Model-based  prognostics  approaches  per¬ 
form  these  tasks  with  the  aid  of  a  model  that  captures 
knowledge  about  the  system,  its  components,  and  their 
failures,  typically  in  the  form  of  a  physics-based  model 
that  is  derived  from  first  principles  (Roemer,  Byington, 
Kacprzynski,  &  Vachtsevanos,  2005;  Byington,  Wat- 
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son,  Edwards,  &  Stoelting,  2004;  Saha  &  Goebel,  2009; 
Daigle  &  Goebel,  2010). 

The  expression  of  confidence  in  a  prediction  provides 
important  information  to  a  decision  maker.  It  is  therefore 
critical  to  properly  represent  and  process  various  sources 
of  uncertainty.  EOL  and  RUL  can  then  be,  for  exam¬ 
ple,  embodied  as  probability  distributions.  These  distri¬ 
butions  are  often  dominated  by  the  uncertainty  of  future 
usage.  For  the  system  considered  here,  we  assume  a  sin¬ 
gle  trajectory  of  future  usage,  which,  for  a  given  fault 
mode,  makes  the  distribution  unimodal  (but  not  neces¬ 
sarily  Gaussian).  In  this  case,  the  means  and  variances 
of  these  distributions  are  the  most  important  and  use¬ 
ful  pieces  of  information,  as  they  provide  information 
on  both  the  accuracy  and  spread  of  the  prediction.  Of¬ 
ten,  the  EOL  distribution  is  obtained  starting  with  a  dis¬ 
tribution  describing  the  current  state  of  the  system,  and 
propagating  that  distribution  forward  to  EOL.  If  the  rep¬ 
resentation  of  the  distribution  is  sample-based,  as  with 
particle  filters,  then  this  is  straightforward,  otherwise,  in 
general,  a  sample-based  representation  is  needed,  as  of¬ 
ten  an  analytical  solution  is  unavailable  or  intractable. 
Prediction  is  then  performed  by  simulating  each  sample 
forward  to  EOL.  However,  this  task  can  be  computation¬ 
ally  prohibitive  due  to  the  large  number  of  samples  often 
needed  to  accurately  represent  the  state  distribution. 

In  this  paper,  we  develop  a  novel  method  to  increase 
the  efficiency  of  the  prediction  step.  We  do  this  us¬ 
ing  the  unscented  transform  (Julier  &  Uhlmann,  1997), 
which  is  a  method  to  predict  the  mean  and  covariance  of 
a  distribution  that  undergoes  a  nonlinear  transformation. 
In  this  case,  the  nonlinear  transformation  is  the  simula¬ 
tion  to  EOL.  The  unscented  transform  approximates  the 
given  distribution  with  deterministically  selected  sam¬ 
ples,  which  are  then  transformed,  and  the  mean  and  co- 
variance  of  the  EOL  distribution  may  be  computed  from 
the  transformed  samples.  Effectively,  only  the  minimal 
amount  of  simulations  are  being  performed,  and  the  sam¬ 
ples  are  chosen  in  such  a  way  that  the  predicted  mean 
and  covariance  closely  approximate  the  mean  and  co- 
variance  obtained  by  transforming  the  entire  distribution, 
thus  achieving  the  same  result  at  a  fraction  of  the  compu¬ 
tational  cost,  both  in  time  and  memory.  Since  prediction 
is  the  main  goal  of  prognostics,  computationally  efficient 
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Figure  1 :  Prognostics  architecture. 


prediction  is  of  utmost  importance.  Efficient  prediction 
methods  take  less  time,  so,  therefore,  more  predictions 
can  be  made  at  a  faster  rate. 

We  review  the  common  forms  of  the  unscented  trans¬ 
form,  and  develop  the  new  prediction  methodology  as 
part  of  our  model-based  prognostics  framework  (Daigle 
&  Goebel,  2009,  2010).  As  a  case  study,  we  construct 
a  detailed  physics-based  model  of  a  solenoid  valve  that 
includes  models  of  different  damage  mechanisms  and 
their  progression.  Solenoid  valves  have  application  in 
many  domains,  and  reliable  performance  of  these  valves 
is  crucial  to  many  complex  systems  (Tansel,  Perotti,  Ye- 
nilmez,  &  Chen,  2005).  We  run  a  set  of  simulation- 
based  prognostics  experiments,  using  the  solenoid  valve 
model,  to  demonstrate  the  application  of  the  new  pre¬ 
diction  methodology  and  compare  it  to  the  baseline  ap¬ 
proach. 

The  paper  is  organized  as  follows.  Section  2.  describes 
the  prognostics  approach.  Section  3.  presents  the  model¬ 
ing  methodology  and  develops  the  model  of  the  solenoid 
valve.  Section  4.  discusses  the  damage  estimation  ap¬ 
proach.  Section  5.  overviews  the  unscented  transform 
and  develops  the  new  prediction  procedure.  Section  6. 
presents  comprehensive  simulation  experiments  apply¬ 
ing  the  framework  to  the  solenoid  valve  case  study.  Sec¬ 
tion  7.  concludes  the  paper. 

2.  PROGNOSTICS  APPROACH 

The  problem  of  prognostics  is  to  predict  the  EOL  and/or 
the  RUL  of  a  component.  In  this  section,  we  first 
formally  define  the  problem  of  model-based  prognos¬ 
tics.  We  then  describe  a  general  model-based  architec¬ 
ture  within  which  a  prognostics  solution  may  be  imple¬ 
mented. 

2.1  Problem  Formulation 

In  a  general  model-based  prognostics  approach,  the  sys¬ 
tem  model  may  be  given  by 

*(<)  =  f(f,x(f),0(f),u(f),v(f)) 

y  (t)  =  h(t,x(t),0(t),u(t),n(t)), 

where  x(t)  £  Rnx  is  the  state  vector,  0(t)  €  Rn#  is 
the  parameter  vector,  u(t)  £  R"u  is  the  input  vector, 
v(t)  £  K”’'  is  the  process  noise  vector,  f  is  the  state 
equation,  y  (/)  £  is  the  output  vector,  n(t)  £  Knr*  is 
the  measurement  noise  vector,  and  h  is  the  output  equa¬ 
tion.  The  parameters  0(t)  evolve  by  some  unknown  pro¬ 
cess,  but,  in  practice,  are  typically  considered  to  be  con¬ 
stant. 

Our  goal  is  to  predict  EOL  at  a  given  time  point  tp 
using  the  discrete  sequence  of  observations  up  to  time 
tp,  denoted  as  yo:t;,.  EOL  is  defined  as  the  time  point 
at  which  the  component  no  longer  meets  a  functional  re¬ 
quirement  (e.g.,  a  valve  does  not  open  in  the  required 
amount  of  time).  This  point  is  often  linked  to  a  damage 


threshold,  beyond  which  the  component  fails  to  func¬ 
tion  properly.  In  general,  we  may  express  this  thresh¬ 
old  as  a  function  of  the  system  state  and  parameters, 
TEOL(x-(t),0(t)),  which  determines  whether  EOL  has 
been  reached,  where 

rji  ,  /  1,  if  EOL  is  reached 

TBOL(x(i),0(i))  =  |  0}  otherwise. 

Using  this  function,  we  can  formally  define  EOL  with 

EOL(tP )  =  argmin  TEOL(x(t),0(t))  =  1, 

t^t  p 

and  RUL  with 

RUL(tP)  =  EOL(tP)  -  tP. 

Due  to  the  many  sources  of  uncertainty  that  exist  in  the 
prediction  problem,  it  is  much  more  useful  to  compute  a 
probability  distribution  of  the  EOL  or  RUL,  rather  than 
a  single  prediction  point.  The  goal,  then,  is  to  compute, 
at  time  tP,  p(EOL(tp)  |y0:tp)  or  p(RUL(tP)\y0,tp). 

2.2  Prognostics  Architecture 

We  adopt  a  model-based  approach,  wherein  we  develop 
detailed  physics-based  models  of  components  and  sys¬ 
tems  that  include  descriptions  of  how  fault  parameters 
evolve  in  time.  These  models  depend  on  unknown  and 
possibly  time-varying  wear  parameters,  0{t).  Therefore, 
our  solution  to  the  prognostics  problem  takes  the  per¬ 
spective  of  joint  state-parameter  estimation.  In  discrete 
time  k,  we  estimate  x/c  and  0k,  and  use  these  estimates 
to  predict  EOL  and  RUL  at  desired  time  points.  Using 
p(xkP,0kp\yo:kp)  at  prediction  time  kp,  we  compute 
p{EOLkp\y0:kp)  and  p(RULkp  |yo:fcj=  )- 

We  employ  the  prognostics  architecture  in  Fig.  1 
(Daigle  &  Goebel,  2010).  The  system  is  provided  with 
inputs  Ufc  and  provides  measured  outputs  yk.  The  fault 
detection,  isolation,  and  identification  (FDII)  module  de¬ 
termines  a  fault  set  F,  which  is  used  by  the  damage  es¬ 
timation  module  to  determine  estimates  of  the  states  and 
unknown  parameters,  represented  as  a  probability  dis¬ 
tribution  p(xk,  0k\yO:k).  The  prediction  module  uses 
this  distribution,  along  with  hypothesized  future  inputs, 
to  compute  EOL  and  RUL  as  probability  distributions 
p{EOLkp\y0:kp)  and  p(RULkp\y0:kp).  In  this  paper, 
we  focus  on  the  damage  estimation  and  prediction  mod¬ 
ules,  and  assume  a  solution  to  EDIT 

3.  SOLENOID  VALVE  MODELING 

We  apply  our  prognostics  approach  to  a  solenoid  valve, 
and  develop  a  physics-based  model  of  its  nominal  and 
faulty  behavior.  A  typical  three-way,  two-position 
solenoid  valve  for  controlling  gas  flow  is  shown  in  Fig.  2. 
The  valve  is  held  in  its  de-energized  position  by  the  re¬ 
turn  spring,  as  shown  in  the  figure.  In  this  position,  gas  is 
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Armature 


Cylinder  Port 


Return 


Figure  2:  Three-way  two-position  solenoid  valve. 


allowed  to  pass  between  the  normally  open  port  and  the 
cylinder  port.  To  energize  the  valve,  a  voltage  is  applied 
to  the  solenoid,  which  produces  an  electromagnetic  force 
that  moves  the  valve  stem  towards  its  energized  position 
until  it  contacts  the  seat.  In  this  position,  gas  is  allowed 
to  pass  between  the  normally  closed  port  and  the  cylinder 
port.  We  refer  to  the  de-energized  position  as  the  closed 
position,  and  the  energized  position  as  the  open  position. 

The  state  x  of  the  solenoid  valve  is  given  by 


x(t) 


x{t) 

v(t)  , 
J{t)_ 


where  x(t)  is  the  valve  position,  v(t)  is  the  valve  veloc¬ 
ity,  and  i(t)  is  the  solenoid  current.  We  define  x  =  0 
as  the  position  of  the  valve  when  in  the  closed  (de¬ 
energized)  position,  and  x  =  Ls  as  the  position  of  the 
valve  when  in  the  open  (energized)  position,  where  Ls  is 
the  length  of  the  valve  stroke. 

The  position  derivative  is  given  by  v(t),  and  the  veloc¬ 
ity  derivative  is  determined  from  the  forces  acting  on  the 
stem: 

—jr  =  —  iFe(t)  -  k(x{t)  -  Xo)  -  rv(t )  -  Fc(t)) , 

where  Fe(t)  is  the  electromagnetic  force,  k  is  the  return 
spring  constant  and  x0  is  the  amount  of  spring  compres¬ 
sion  when  the  valve  is  in  the  closed  position  (where  we 
lump  the  armature  and  return  spring  into  a  single  spring), 
r  is  the  kinetic  friction  coefficient,  and  Fc(t)  is  the  con¬ 
tact  force  with  the  seat,  which  may  be  described  by 


The  force  acts  to  decrease  the  reluctance  of  the  magnetic 
circuit  by  decreasing  the  air  gap,  which  is  a  function  of 
x,  thus  acting  to  open  the  valve.  The  solenoid  current  is 
described  by 


di(t) 

dt 


At  (»w- 


where  u(t)  is  the  applied  voltage,  and  R  is  the  coil  re¬ 
sistance  (Lyshevski  et  ah,  1999;  Rahman  et  ah,  1996; 
Szente  &  Vad,  2001).  The  voltage  u(t)  is  the  only  exter¬ 
nal  input  considered  here,  i.e., 

u  (t)  =  [«(t)] . 

The  inductance  of  a  solenoid  is  given  by 


L{x) 


N 2 

7 Z(xY 


where  N  is  the  number  of  wire  turns  in  the  coil,  and 
1Z  is  the  reluctance  of  the  magnetic  circuit.  In  general, 
reluctance  is  given  by 


where  l  is  the  length  of  the  magnetic  circuit,  A  is  the 
cross-sectional  area  of  the  circuit,  and  g  is  the  magnetic 
permeability  of  the  material.  If  we  define  the  maximum 
air  gap  as  go,  then  the  actual  air  gap  is  given  by  go  —  x. 
The  reluctance  depends  on  the  geometry  of  the  solenoid. 
We  may  assume  a  typical  geometry  in  which  reluctance 
is  described  by 


rw-,  /  ,  L  Qo  x 

K{x)  =  — V  + 

[icAc  go  Ag 


where  the  c  subscript  denotes  lumped  parameters  for 
the  core  and  armature,  /j0  is  the  permeability  of  air, 
and  As  is  the  effective  cross-sectional  area  of  the  air 
gap  (Lyshevski  et  al.,  1999).  Therefore,  the  inductance 
is  given  by 


L{x) 


N'2gpAggcAc 
Mo Aglc  T  gcAc(go  x'j 


and  its  derivative  with  respect  to  x  is 


dL(x) 

dx 


N2goAgg2cA2c 

(/ioAg(c  -f-  gcAc(go  at)) 


Fc(t) 


kc(—x),  if  x  <  0, 

0,  if  0  <  x  <  Ls, 

kc(x  -Lg),  if  x  >  Ls , 


where  kc  is  the  (large)  spring  constant  associated  with 
the  flexible  seats.  In  general,  we  may  also  consider 
forces  from  the  gas  flowing  through  the  valve,  however, 
here,  we  assume  a  balanced  design  in  which  the  pressure 
forces  always  cancel. 

The  solenoid  force  is  given  by 


Fe(t) 


d  L{x) 
dx 


where  L(x)  is  the  inductance  of  the  solenoid  (Lyshevski, 
Sinha,  &  Seger,  1999;  Rahman,  Cheung,  &  Lim,  1996). 


We  select  our  complete  measurement  vector  as 


y  (t) 


x(t) 

i(t) 

open(t)  ’ 
closed(t) 


The  open(t)  and  closed(t )  measurements  are  discrete 
sensors  which  output  1  if  the  valve  is  in  the  fully  opened 
or  fully  closed  state: 


open(t) 


closed(t) 


fl,  if  x(t)  >  Ls 
[0,  otherwise 

f  1,  if  x(t)  <  0 
1  0,  otherwise. 
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Figure  3:  Nominal  solenoid  valve  operation 


Fig.  3  shows  a  nominal  valve  cycle.  The  valve  is  com¬ 
manded  to  open  at  0  s.  The  current  and  magnetic  field 
build  up  in  the  solenoid,  and  soon,  enough  force  is  pro¬ 
duced  to  overcome  friction  and  the  return  spring.  As  the 

,  ,  .  .  dLix)  .  .  .  ,  . 

valve  moves,  the  i{t) — - — v(t)  term  begins  to  domt- 

ox 

nate,  causing  the  current  to  decrease.  When  the  valve 
opens  against  the  seat  and  stops  moving,  the  current  in¬ 
creases  again,  resulting  in  the  cusp  observed  in  the  cur¬ 
rent  just  before  0.05  s.  The  current  then  increases  to 
its  steady  state,  determined  by  the  applied  voltage  and 
the  coil  resistance.  At  0.25  seconds,  the  valve  is  com¬ 
manded  to  close  by  removing  the  applied  voltage.  The 
current  drains  out  of  the  solenoid,  and  soon,  the  electro¬ 
magnetic  force  is  no  longer  strong  enough  to  keep  the 
valve  in  place.  The  valve  begins  to  close,  and,  as  the 

.  .  dLix)  .  .  .  ... 

lit) — - — v(t)  term  again  comes  to  dominate,  the  cur- 

ox 

rent  increases  briefly  until  the  valve  fully  closes  and  v(t) 
becomes  0,  resulting  in  another  cusp.  The  current  then 
decreases  smoothly  to  0. 


3.1  Damage  Modeling 

In  our  modeling  methodology,  the  nominal  model  is  ex¬ 
tended  with  damage  models.  These  models  describe  how 
parameters  associated  with  the  degree  of  valve  damage 
progress  in  time,  and  allow  us  to  make  predictions  of 
damage  progression.  From  valve  documentation  and  his¬ 
torical  maintenance  records,  we  have  identified  the  most 
relevant  faults  for  prognostics.  The  set  of  faults  includes 
friction  damage,  spring  damage,  and  the  accumulation  of 
debris  on  the  valve  seats. 


A  common  damage  mechanism  present  in  valves  is 
sliding  wear  (Daigle  &  Goebel,  2009).  The  equation  for 
sliding  wear  takes  on  the  following  form: 

V{t)  =  w\F(t)v{t)\, 

where  V(t)  is  the  wear  volume,  w  is  the  wear  coef¬ 
ficient  (which  depends  on  material  properties  such  as 
hardness),  F(t)  is  the  sliding  force,  and  v(t)  is  the  slid¬ 
ing  velocity  (Hutchings,  1992).  Friction  will  increase 
linearly  with  sliding  wear,  because  the  contact  area  be¬ 
tween  the  sliding  bodies  becomes  greater  as  surface  as¬ 
perities  wear  down  (Hutchings,  1992).  We  characterize 
friction  damage  by  a  change  in  the  friction  coefficient, 
and  model  the  damage  progression  in  a  form  similar  to 
sliding  wear  (Daigle  &  Goebel,  2009): 

r(t)  =  wr\Ff(t)v(t)\ 

where  wr  is  the  wear  coefficient,  and  Ff(t)  is  the  friction 
force  defined  previously.  The  friction  parameter  only 
grows  when  the  valve  is  moving,  so,  the  friction  parame¬ 
ter  evolves  in  a  step-wise  fashion,  with  damage  only  oc¬ 
curring  during  the  valve’s  opening  and  closing  motions. 
As  the  friction  parameter  increases,  the  friction  force  in¬ 
creases,  further  increasing  the  rate  at  which  the  friction 
parameter  grows,  resulting  in  a  damage  progression  sim¬ 
ilar  to  an  exponential  when  viewed  at  large  time  scales. 
We  define  r+  as  the  largest  value  of  the  friction  coeffi¬ 
cient  at  which  the  valve  still  actuates  in  the  required  time. 
So,  TEOL{x(t),0(t))  =  1  if  r(t)  >  r+. 

We  assume  a  similar  equation  form  for  spring  dam¬ 
age  (Daigle  &  Goebel,  2009): 

k(t)  =  -wk\Fa(t)v(t)\, 

where  Wk  is  the  spring  wear  coefficient  and  Fs  (t )  is  the 
spring  force.  The  more  the  spring  is  used,  the  weaker  it 
becomes,  characterized  by  the  change  in  the  spring  con¬ 
stant.  As  with  friction  damage,  the  spring  constant  only 
decreases  when  the  valve  moves.  As  the  spring  becomes 
damaged,  the  spring  force  will  decrease,  and  so  the  rate 
at  which  spring  damage  occurs  will  also  decrease.  We 
define  k~  as  the  smallest  value  of  the  spring  constant 
at  which  the  valve  still  closes  in  the  required  time.  So, 
TEOL(x(t),6(t))  =  1  if  k{t)  <  k~. 

Another  failure  relates  to  the  accumulation  of  partic¬ 
ulate  matter  and  other  forms  of  debris  at  the  seats.  As 
debris  builds  up,  it  impedes  the  valve’s  travel  and  pre¬ 
vents  the  valve  from  fully  opening  or  closing,  which,  in 
turn,  causes  leaks  through  the  valve.  We  assume  that  the 
accumulation  of  debris  is  due  to  sliding  wear.  It  results 
in  a  change  in  the  boundary  conditions  of  the  valve  mo¬ 
tion.  We  define  Lc  as  the  boundary  when  the  valve  is 
in  the  closed  position  (nominally  0,  where  Lc  >  0),  and 
Ls  —  L0  as  the  boundary  when  in  the  open  position  (nom¬ 
inally  Ls,  where  L„  >  0).  We  assume  that  the  rates  of 
change  of  the  offsets  Lc  and  L0  grow  proportionally  to 
sliding  wear: 

Lc(t)  =  wc\Ff(t)v(t)\ 

L0(t)  =  w0\Ff(t)v(t)\. 

We  define  L+  and  L+  as  the  largest  allowable  values  of 
the  offsets.  So,  TEOi(x(i),  0(f))  =  1  if  Lc(t)  >  L+  or 
La(t)  >  L+. 
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Algorithm  1  SIR  Filter 

Inputs:  {{x.l-1,Ok-i)?™k-1}?=i,Uk-i-.k,yk 
Outputs:  {(xj.,0*),  w4h=i 

for  i  =  1  to  A  do 

0l~p(0k\9i_  i) 

w'k  <—  p(yfc|Xfe,  9k,  Ufe) 

end  for 

N 

w^j2wi* 

i= 1 

for  i  =  1  to  N  do 
wl  <-  wl/W 

end  for 

{(xL0fc)>u4}£i  <-  Resample({(x*fc,01fc),w^.}£1) 


4.  DAMAGE  ESTIMATION 

In  the  model-based  paradigm,  damage  estimation  re¬ 
duces  to  joint  state-parameter  estimation,  i.e.,  computa¬ 
tion  of  p(xk,  0fc|yo:fc)-  A  general  solution  to  this  prob¬ 
lem  is  the  particle  filter,  which  may  be  directly  applied 
to  nonlinear  systems  with  non-Gaussian  noise  terms. 
Particle  filters  offer  approximate  (suboptimal)  solutions 
for  systems  where  optimal  solutions  are  unavailable  or 
intractable  (Arulampalam,  Masked,  Gordon,  &  Clapp, 
2002;  Cappe,  Godsill,  &  Moulines,  2007).  In  particle 
filters,  the  state  distribution  is  approximated  by  a  set  of 
discrete  weighted  samples,  called  particles.  As  the  num¬ 
ber  of  particles  is  increased,  performance  increases  and 
the  optimal  solution  is  approached. 

With  particle  filters,  the  particle  approximation  to  the 
state  distribution  is  given  by 

{(x^W}£Li, 

where  N  denotes  the  number  of  particles,  and  for  particle 
i,  xj,  denotes  the  state  vector  estimate,  9\  denotes  the 
parameter  vector  estimate,  and  w\  denotes  the  weight. 
The  posterior  density  is  approximated  by 

N 

p(xk,ek |y0:fc)  «  5^«tfc5(xii0<)(dxfcd0fc), 

where  Slx,  gifidxkd9 k)  denotes  the  Dirac  delta  func- 
tion  located  at  (xj.,  9k). 

We  employ  the  sampling  importance  resampling  (SIR) 
particle  filter,  and  implement  the  resampling  step  us¬ 
ing  systematic  resampling  (Kitagawa,  1996).  The  pseu¬ 
docode  for  a  single  step  of  the  SIR  filter  is  shown  as  Al¬ 
gorithm  1 .  Each  particle  is  propagated  forward  to  time 
k  by  first  sampling  new  parameter  values  and  sampling 
new  states.  The  particle  weight  is  assigned  using  yk.  The 
weights  are  then  normalized,  followed  by  the  resampling 
step1. 

Here,  the  parameters  0k  evolve  by  some  unknown  pro¬ 
cess  that  is  independent  of  the  state  xk.  However,  we 
need  to  assign  some  type  of  evolution  to  the  parameters. 
The  typical  solution  is  to  use  a  random  walk,  i.e.,  for  pa¬ 
rameter  9,  9k  =  9k_i  +  £fc_i,  where  S,k-i  is  typically 

Pseudocode  for  the  systematic  resampling  algorithm  is 
provided  in  (Arulampalam  et  al.,  2002). 


Gaussian  noise.  With  this  type  of  evolution,  the  particles 
generated  with  parameter  values  closest  to  the  true  val¬ 
ues  should  be  assigned  higher  weight,  thus  allowing  the 
particle  filter  to  converge  to  the  true  values.  The  selected 
variance  of  the  random  walk  noise  determines  both  the 
rate  of  this  convergence  and  the  estimation  performance 
once  convergence  is  achieved. 

Note  that  in  a  particle  filter,  a  certain  amount  of  sen¬ 
sor  noise  must  be  assumed,  but,  in  practice,  the  discrete 
position  sensors  ( open  and  closed )  have  no  noise,  there¬ 
fore,  a  small  amount  of  noise  must  be  assumed  within 
the  particle  filter  for  those  sensors. 

5.  PREDICTION 

Prediction  is  initiated  at  a  given  time  kp.  Using  the  cur¬ 
rent  joint  state-parameter  estimate,  p(xkp,9kp\yo-.kp), 
which  represents  the  most  up-to-date  knowledge  of 
the  system  at  time  kp,  the  goal  is  to  compute 
p{EOLkp\y0:kp)  and  p{RULkp\y0:kp).  As  discussed 
in  Section  4.,  the  particle  filter  computes 

N 

p(.xkp,9kp\y0:kp)  «  5Z«tfcp5(xipieip)(dxfcpd0fcp). 

We  can  approximate  a  prediction  distribution  n  steps  for¬ 
ward  as  (Doucet,  Godsill,  &  Andrieu,  2000) 

Pi’X-kp+m  @kp-\-n \Y0:kp  )  ~ 

N 

Y  Wkr  5(xtp+„  ,»ip+ J  (dxkp+nd9k  p+n)- 

So,  for  a  particle  i  propagated  n  steps  forward  without 
new  data,  we  may  take  its  weight  as  w'Lkp .  Similarly,  we 
can  approximate  the  EOL  as 

N 

p{EOLkp\y0:kp)  «  YwIJeol^  jdEOLkp). 

i— 1 

To  compute  EOL,  then,  we  propagate  each  particle  for¬ 
ward  to  its  own  EOL  and  use  that  particle’s  weight  at  kp 
for  the  weight  of  its  EOL  prediction. 

If  an  analytical  solution  exists  for  the  prediction,  this 
may  be  directly  used  to  obtain  the  prediction  from  the 
state-parameter  distribution.  An  analytical  solution  is 
rarely  available,  so  the  general  approach  to  solving  the 
prediction  problem  is  through  simulation.  Each  parti¬ 
cle  is  simulated  forward  to  EOL  to  obtain  the  complete 
EOL  distribution.  The  pseudocode  for  the  baseline  pre¬ 
diction  procedure  is  given  as  Algorithm  2  (Daigle  & 
Goebel,  2010).  Each  particle  i  is  propagated  forward 
until  TporA'X-l,  9k)  evaluates  to  1;  at  this  point  EOL  has 
been  reached  for  this  particle. 

Note  that,  in  general,  we  may  sample  new  parameter 
values  9,  however,  the  noise  considered  here  should  typ¬ 
ically  be  considerably  less  than  the  noise  used  for  the 
random  walk  during  the  estimation  phase,  as  we  usu¬ 
ally  assume  these  parameters  are  either  constant  or  only 
exhibit  very  small  deviations.  Note  also  that  prediction 
requires  hypothesizing  future  inputs  of  the  system,  u/,, 
because  damage  progression  is  rarely  independent  of  the 
system  inputs.  For  this  reason  the  inputs  must  be  cho¬ 
sen  carefully.  Here,  we  assume  only  a  single  future  input 


5 


Annual  Conference  of  the  Prognostics  and  Health  Management  Society,  2010 


Algorithm  2  EOL  Prediction 

Inputs:  {(xip,eip),wip}fLi 
Outputs:  {EOL\p,wlp}i=1 

for  i  =  1  to  A  do 


k  <—  kp 


while  Teol(x.%,  6lk)  =  0  do 
Predict  uk 
0i+i~p(Ok+i\oi) 

Xfc+1  ~p(xfe+ l|x*fc,0lfc,Ufc) 
k  <—  k  +  1 

Y*  , 

xfc  xfc+l 

At  .  At 

“k  r'fc+l 

end  while 

EOL{p  <-  fc 

end  for 


trajectory,  i.e.,  ufc  is  defined  uniquely  for  all  values  of 
k.  This  is  a  practical  assumption  for  the  solenoid  valve, 
because  the  valve  is  always  fully  opened  or  fully  closed, 
and  a  single  voltage  value  u(t )  is  consistently  applied 
for  opening  the  valve.  Since  damage  occurs  only  when 
the  valve  is  moving,  then  for  the  purposes  of  prediction, 
we  may  produce  an  input  sequence  that  represents  a  full 
valve  cycle  (e.g.,  that  of  Fig.  3)  repeated  indefinitely, 
and,  using  this,  we  may  obtain  EOL  and  RUL  predic¬ 
tions  in  the  number  of  valve  cycles. 

5.1  Computationally  Efficient  Prediction 

The  computational  complexity  of  the  prediction  proce¬ 
dure  presented  as  Algorithm  2  is  linear  in  the  number 
of  particles,  however,  each  particle  may  take  a  variable 
amount  of  time  to  simulate  to  EOL.  Particles  that  predict 
quickly  progressing  wear  will  complete  quickly,  while 
particles  that  predict  slowly  progressing  wear  will  com¬ 
plete  slowly,  because  many  more  simulation  steps  will  be 
needed  to  reach  EOL.  This  problem  is  exacerbated  with 
models  that  require  very  small  sampling  periods.  In  fact, 
particles  with  very  poor  wear  parameter  estimates,  i.e., 
close  to  0,  which  correspond  to  very  large  EOL  predic¬ 
tions,  may  take  an  exceedingly  long  time.  Also,  these 
particles  may  correspond  to  outliers,  and,  as  such,  con¬ 
tribute  little  to  the  prediction  distribution. 

The  only  way  to  reduce  the  computational  effort  is  to 
reduce  the  number  of  particles  that  are  used  in  the  pre¬ 
diction  step.  One  approach  is  to  randomly  select  an  arbi¬ 
trary  number  of  particles  from  the  original  distribution, 
but  the  statistics  of  the  original  distribution  may  not  be 
preserved.  A  better  approach  is  to  sample  from  the  distri¬ 
bution  in  such  a  way  that  the  important  statistical  infor¬ 
mation  is  preserved,  and  the  EOL  distribution  computed 
from  this  limited  sample  set  closely  approximates  the 
statistical  properties  of  the  EOL  distribution  computed 
from  the  complete  set  of  samples. 

The  unscented  transform  solves  this  problem.  It  takes 
a  random  variable  x  €  R"x,  with  mean  x  and  covari¬ 
ance  Pxx,  which  is  related  to  a  second  random  vari¬ 
able  y  by  some  nonlinear  function  y  =  g(x),  and  com¬ 
putes  the  mean  y  and  covariance  Pyy  using  a  (small)  set 
of  deterministically  selected  weighted  samples,  called 
sigma  points  (Julier  &  Uhlmann,  1997).  For  the  task 


of  EOL  prediction,  x  is  simply  the  joint  state-parameter 
distribution  represented  by  {(x|.  ,  0lkp),  wikp}(L1,  g  is 
the  function  that  computes  EOL  (i.e.,  simulates  a  par¬ 
ticle  to  EOL),  and  y  is  the  EOL.  The  required  mean  x 
and  covariance  Pxx  may  be  computed  from  the  particle 
distributions  using  the  formulas  for  weighted  mean  and 
weighted  covariance. 

The  statistics  of  y  are  computed  by  selecting  a  set  of 
weighted  sigma  points  from  x,  where  AT,  denotes  the  ith 
point  and  Wi  denotes  its  weight.  The  sigma  points  are 
always  chosen  such  that  the  mean  and  covariance  match 
those  of  the  original  distribution,  x  and  Pxx.  Each  sigma 
point  is  passed  through  g  to  obtain  new  sigma  points  y, 
i.e., 

y,  =  g  (ato 

with  mean  and  covariance  calculated  as 

y  =  wty, 

i 

p  yy  =  Wi^’i  ~  ~  y)T- 

i 

The  underlying  idea  of  the  unscented  transform  is  that 
it  is  easier  to  approximate  the  distribution  x  than  to 
approximate  the  nonlinear  function  g.  This  is  the 
idea  behind  the  unscented  Kalman  filter,  where  the  un¬ 
scented  transform  is  exploited  for  nonlinear  state  estima¬ 
tion  (Julier  &  Uhlmann,  1997,  2004).  At  each  step,  the 
unscented  transform  is  applied  to  the  state  estimate  and 
is  used  for  a  single  step  prediction.  In  contrast,  here,  we 
apply  the  transform  to  the  state-parameter  distribution  at 
given  single  time  point  kp,  and  use  this  for  multi-step 
predictions  to  EOL. 

Several  methods  exist  for  selecting  sigma  points.  In 
the  following  sections,  we  briefly  review  three  common 
unscented  transforms,  and  compare  their  fidelity  on  an 
example  EOL  prediction  problem.  Detailed  performance 
results  will  be  presented  in  Section  6.  for  fault  prognosis 
of  the  solenoid  valve. 

Symmetric  Unscented  Transform 

In  the  symmetric  unscented  transform,  2 nx  +  1  sigma 
points  are  selected  symmetrically  about  the  mean  in  the 
following  way  (Julier  &  Uhlmann,  2004): 


matrix  square  root  of  (nx  +  k) Pxx  (e.g.,  computed  us¬ 
ing  the  Cholesky  decomposition).  The  number  k  is  a 
free  parameter  that  can  be  used  to  tune  the  higher  order 
moments  of  the  distribution.  If  x  is  assumed  Gaussian, 
then  selecting  k  =  .3  nx  is  recommended  (Julier  & 
Uhlmann,  1997).  A  smaller  value  of  n  will  bring  the 
sigma  points  closer  together.  Note  that  the  sigma  point 
weights  do  not  directly  represent  probabilities,  so  are  not 
restricted  to  the  interval  [0,1]. 
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Symmetric  Sigma  Points  (k  =  1)  Minimal  Skew  Simplex  Sigma  Points  (wq  =  0.5)  Spherical  Simplex  Sigma  Points  (wq  —  0.5) 
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Figure  4:  Sigma  point  locations  and  weights  for  a  two-dimensional  random  variable  x  with  x  =  0  and  Pxx  =  I.  The 
gray  dots  represent  the  random  samples,  and  the  markers  represent  the  sigma  points,  labeled  with  their  weights. 


Minimal  Skew  Simplex  Unscented  Transform  as 


The  symmetric  unscented  transform  uses  2 nx  +  1  sigma 
points,  however,  it  is  possible  to  reduce  the  number  of 
points  to  nx  +  2,  while  still  capturing  the  first  two  mo¬ 
ments  of  the  distribution,  thus  reducing  the  amount  of 
computation.  The  minimal  skew  sigma  points  are  such  a 
set,  and  satisfy  an  additional  constraint  in  which  the  skew 
(third  moment)  is  minimized,  which  reduces  the  average 
error  for  a  symmetric  distribution  (Julier  &  Uhlmann, 
2002). 

The  minimal  skew  sigma  points  are  selected  in  a  con¬ 
structive  manner,  first  by  choosing  the  set  of  points  for 
nx  =  1,  and  then  increasing  nx  by  one  until  the  full  di¬ 
mension  is  reached.  The  procedure  for  selecting  sigma 
points  for  dimension  nx  for  a  distribution  with  mean  0 
and  P xx  =  I,  where  I  is  the  identity  matrix,  is  as  fol¬ 
lows  (Julier  &  Uhlmann,  2002).  First,  the  weight  of  the 
0th  sigma  point  is  selected  freely  as 


i  =  0 

i  =  ; 

i  =  j  +  1 


where  0 j  is  a  vector  of  j  zeros.  The  points  form  a  sim¬ 
plex  (a  generalization  of  the  triangle  to  arbitrary  dimen¬ 
sions)  centered  about  the  origin,  with  an  additional  point 
located  at  the  origin  (Afo)- 

The  sigma  points  may  then  be  transformed  to  those  for 
mean  x  and  covariance  Pxx  using 


X'i  —  x  +  \/P  xxX  i , 


Wq  <E  [0,  1]. 


The  remaining  weights  are  computed  using 


where  \/Pxx  is  the  matrix  square  root  of  P The  trans¬ 
formed  sigma  points  y  and  its  statistics  are  computed  as 
in  the  basic  unscented  transform,  using  AT'. 


1  ~  Wq 
2nx  ’ 

2'-V, 


i  =  1,2 

i  =  3, . . .  ,nx  +  1 


For  the  initial  dimension  size  j  =  1,  where  X\  refers 
to  the  ith  sigma  point  for  the  yth  dimensional  space,  the 
sigma  points  are  initialized  as 


*o  =  0 


Spherical  Simplex  Unscented  Transform 

The  problem  identified  with  the  skew  simplex  set  of 
sigma  points  is  that  the  weights  vary  by  a  factor  of  2n“ 
and  the  point  coordinates  vary  by  a  factor  of  2nx'2, 
so  with  large  values  of  nx,  numerical  problems  may 
arise  (Julier,  2003).  The  spherical  simplex  points  still 
use  only  nx  +  2  points,  but  overcome  this  issue,  placing 
the  sigma  points  on  a  hypersphere  centered  at  the  ori¬ 
gin,  with  the  0th  sigma  point  located  at  the  origin.  These 
points  are  constructed  in  a  similar  fashion  to  the  minimal 
skew  sigma  points  as  follows  (Julier,  2003). 

First,  the  weight  of  the  0th  sigma  point  is  selected 
freely  as 


w0  e  [0,1]. 

The  remaining  weights  are  computed  using 


Expanding  up  to  higher  dimensions  j  =  2 , ,nx,  the  w.  =  _ 9. 

higher-dimensional  sigma  points  are  recursively  defined  n  +  1 
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The  sigma  points  for  dimensional  space  j  =  1  are  ini¬ 
tialized  again  as 

*1  =  0 

X\  =  --^= 

y/2w i 

X 1  =  1 

2  y/2wi' 


Expanding  up  to  higher  dimensions  j  =  2 , ,nx,  the 
higher-dimensional  sigma  points  are  recursively  defined 
as 


X3-  - 


vf-i 


-l 


X{ 

1 

0., 

P  j 


i  =  0 

*  =  i  + 1 


where  0j  is  a  vector  of  j  zeros.  These  points  may  then  be 
transformed  for  mean  x  and  covariance  Pxx  as  before. 

Fig.  4  compares  the  location  and  weights  for  a  two- 
dimensional  random  variable  for  the  three  different 
transforms.  The  mean  of  the  random  variable  is  0,  and 
the  covariance  is  I. 


Improved  Prediction  Procedure 

The  improved  prediction  procedure  uses  Algorithm  2, 
only  instead  of  the  inputs  being  the  particles  and  their 
weights,  the  inputs  become  the  sigma  points  and  their 
weights  computed  from  the  particle  distribution  at  time 
kp,  with  a  suitable  sigma  point  selection  algorithm. 
Fig.  5  shows  an  example  output  of  the  prediction  pro¬ 
cedure  for  spring  damage,  using  the  full  particle  distri¬ 
bution  (N  =  100).  Each  particle  creates  a  predicted  tra¬ 
jectory,  and  determines  a  single  EOF  prediction.  These 
individual  predictions  then  form  the  complete  prediction 
distribution. 

The  predicted  EOLs  based  on  simulating  the  sigma 
points  for  the  different  selection  algorithms  reviewed 
here  are  shown  in  Fig.  6.  The  free  parameters  were  se¬ 
lected  by  hand  in  this  particular  example.  The  predicted 
EOL  means  and  variances  are  shown  in  the  figure.  Recall 
that  the  aim  is  to  approximate  the  full  state-parameter 
distribution  using  a  small  set  of  samples,  such  that,  when 
transformed  to  EOL,  accurately  predict  the  mean  and 
variance  of  the  EOL  distribution  computed  using  the  full 
distribution.  An  under-  or  overapproximation  of  either 
statistic  is  undesirable,  as  it  misrepresents  the  EOL  cor¬ 
responding  to  the  current  belief  state.  For  this  example, 
in  each  case,  the  mean  EOL  predicted  with  the  sigma 
points  matches  the  mean  from  the  full  distribution  within 
0.2%  error.  The  variances  are  less  accurate,  with  around 
6%  error  for  the  symmetric  and  simplex  sigma  points, 
and  around  16%  error  for  the  spherical  sigma  points.  The 
error  in  predicted  variance  may  be  improved  with  bet¬ 
ter  selection  of  the  free  parameters.  An  improvement  in 
computation  time  of  18-24%  was  observed  for  the  sigma 
point  method.  In  this  case  nx  =  5,  so  the  symmetric 
set  has  11  sigma  points  and  the  simplex  methods  have  7 
points,  so  there  is  also  a  very  significant  improvement  in 
memory  requirements  for  prediction. 


Figure  5:  Predicted  trajectories  and  EOL  distribution  for 
spring  damage. 


EOL  Probability  Mass  Rmction 


Symmetric  Sigma  Points  (k  =  0.1) 


Time  (cycles) 


Time  (cycles) 


Skew  Simplex  Sigma  Points  (it)0  =  0.3)  Spherical  Simplex  Sigma  Points  (wq  =  0.14) 
0.3 


y  =  140.14 
P„,,  =  30.93 


,  0.2 


Time  (cycles) 


120  130  140  150  160 

Time  (cycles) 


Figure  6:  EOL  predictions  based  on  sigma  points. 


6.  RESULTS 

In  this  section,  we  evaluate  the  prognostics  performance 
for  the  different  prediction  methods.  In  each  case,  we 
predict  using  the  full  particle  distribution,  the  symmetric 
sigma  points,  the  minimal  skew  simplex  sigma  points, 
and  the  spherical  simplex  sigma  points,  in  order  to  com¬ 
pare  the  accuracy,  precision,  and  computational  cost  of 
the  prediction. 

Estimation  accuracy  is  evaluated  using  percentage 
root  mean  square  error  (PRMSE),  which  expresses  rel¬ 
ative  estimation  accuracy  as  a  percentage: 


PRMSE  =  100 


Mean^. 
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Table  1:  Prognostics  Performance  Results  for  N  =  500  and  M  =  {x,  i,  open,  closed} 


Fault 

PRMSE 

RSD,„ 

RA 

RSDijj/i, 

T  cpu 

Full 

Sym. 

Skew 

Sph. 

Full 

Sym. 

Skew 

Sph. 

Max 

Sym. 

Skew 

Sph. 

r 

3.72 

26.09 

94.39 

94.49 

93.90 

94.48 

19.89 

18.40 

21.35 

18.48 

72.98 

57.39 

53.91 

60.21 

k 

3.90 

14.74 

96.19 

96.20 

96.12 

96.17 

15.21 

14.27 

15.35 

14.42 

61.46 

58.05 

55.85 

58.06 

Lc 

5.01 

18.01 

93.62 

93.85 

93.82 

93.77 

22.65 

20.18 

22.35 

21.63 

77.22 

63.32 

61.38 

61.41 

L0 

3.05 

17.89 

95.48 

95.59 

95.12 

95.36 

16.95 

16.20 

19.30 

18.61 

67.93 

56.70 

52.69 

53.56 

where  Wk  denotes  the  estimated  wear  parameter  value 
at  time  k,  w}  denotes  the  true  wear  parameter  value  at 
/>:,  and  Mean;,,  denotes  the  mean  over  all  values  of  k. 
Estimation  spread  is  calculated  using  relative  standard 
deviation  (RSD),  computed  for  the  wear  parameter  dis¬ 
tribution  at  each  prediction  point  (every  10  cycles),  and 
averaged  over  all  prediction  points.  The  average  is  de¬ 
noted  as  RSD.  In  computing  both  PRMSE  and  RSD, 
we  ignore  the  initial  time  period  associated  with  estima¬ 
tion  convergence.  Convergence  of  the  wear  parameter 
estimate,  Cw ,  is  computed  based  on  the  definition  of  the 
convergence  metric  described  in  (Saxena  et  al.,  2008), 
where  the  convergence  of  a  curve  is  expressed  as  the  dis¬ 
tance  from  the  origin  to  the  centroid  under  the  curve  (a 
shorter  distance  is  better).  We  use  the  absolute  error  of 
the  hidden  parameter  estimate  as  the  curve. 

For  a  given  prediction  point  kp ,  we  compute  measures 
of  accuracy  and  spread.  For  accuracy,  we  use  the  relative 
accuracy  (RA)  metric  (Saxena,  Celaya,  Saha,  Saha,  & 
Goebel,  2009): 


R  ^kp 


100 


I  RUL*kp 


-  Mean, (RUL}p)\ 

RUL*kp 


We  calculate  prediction  spread  using  RSD,  which  we  de¬ 
note  as  RSDjj[/l-  Both  RA  and  RSD  are  averaged  over 
all  prediction  points  starting  from  the  prediction  at  which 
a  prognostics  horizon  (RA  within  a  specified  bound)  is 
first  reached  (denoted  using  RA  and  RSD^ux). 

In  order  to  measure  the  computational  performance,  at 
each  prediction  point  we  measure  the  time  taken  for  the 
prediction  to  be  completed,  tcpu(kp).  For  a  given  pre¬ 
diction  method,  we  then  compute  the  percent  improve¬ 
ment  over  the  time  for  the  full  distribution,  tlpll(kp), 
defined  as 


Tcpu  =  100 


H pu(kP)  -  tcpu(kp)  I 


i full 


(kp) 


This  metric  is  then  averaged  over  all  kp,  denoted  as 
Tcpu,  to  summarize  percent  improvement  over  the  en¬ 
tire  experiment.  We  characterize  the  maximum  possible 
performance  increase  by  computing  Tcpu  for  the  pre¬ 
diction  using  a  single  point  representing  the  mean  of 
the  state-parameter  distribution.  This  performance  can 
be  achieved  with  the  sigma  point  method  by  selecting 
a  small  enough  value  of  k  or  vjq  such  that  all  the  sigma 
points  are  concentrated  on  the  mean,  however,  this  would 
result  in  a  vast  underapproximation  to  the  variance. 

We  consider  the  case  where  only  a  single  damage 
mode  is  actively  progressing.  Table  1  shows  the  per¬ 
formance  for  each  fault  for  N  =  500,  and  taking  the 


complete  measurement  set  M.  The  random  walk  vari¬ 
ances  were  chosen  as  fixed  values  assuming  that  the  or¬ 
ders  of  magnitude  of  the  wear  parameters  were  known. 
Overall,  the  unknown  wear  parameter  can  be  estimated 
well.  The  desired  outcome  is  that  the  computed  RA  and 
RSD pu p  using  the  sigma  point  methods  closely  approx¬ 
imate  those  produced  using  the  full  distribution.  In  the 
case  of  RA,  the  sigma  point  methods  are  within  0.5% 
of  the  RA  calculated  using  the  full  distribution,  mean¬ 
ing  that  the  means  of  the  distribution  (from  which  RA 
is  calculated)  are  predicted  well.  Farger  differences  are 
observed  when  comparing  RSD pjjl,  as  in  most  cases 
the  sigma  point  methods  underapproximate  the  variance, 
with  a  worst-case  error  of  6-14%.  The  accuracy  of  vari¬ 
ance  prediction  depends  on  the  selected  values  of  the  free 
parameters  of  the  unscented  transforms,  as  correctly  se¬ 
lected  values  will  lead  to  better  approximations.  In  this 
case,  we  selected  the  suggested  value  of  k  for  the  sym¬ 
metric  sigma  points,  and  this  seemed  to  work  well  in 
all  cases.  For  the  minimal  skew  simplex  sigma  points, 
we  chose  u>0  =  —  1  for  r,  Lc,  and  L0,  and  vjq  =  0.1 
for  k.  For  the  spherical  simplex  sigma  points,  we  chose 
Wo  =  0.1  for  k,  Lc,  and  L0,  and  wq  =  —  1  for  r.  These 
values  were  selected  manually. 

Over  50%  improvement  in  computation  time  was  ob¬ 
served  in  all  cases,  coming  within  75-90%  of  the  maxi¬ 
mum  possible  improvement.  Further,  only  a  fraction  of 
the  samples  are  used  in  the  sigma  point  methods,  sav¬ 
ing  significantly  on  memory.  At  the  selected  prediction 
points,  the  valve  is  in  a  closed  state,  and  the  effective 
nx  is  only  3  (i.e.,  only  3  of  the  states  have  different  val¬ 
ues  between  particles).  Therefore,  for  the  symmetric  un¬ 
scented  transform  only  2 nx  +  1  =  7  sigma  points  are 
required,  and  only  nx  +  2  =  5  are  required  for  the  min¬ 
imal  skew  simplex  and  spherical  simplex  sigma  points. 
For  N  =  500  this  is  an  improvement  of  over  98%  in 
memory  usage. 

To  explore  further,  we  focus  on  the  case  of  spring 
damage,  and  vary  the  number  of  particles  and  the  mea¬ 
surement  set.  Table  2  shows  the  estimation  results.  Over¬ 
all,  the  unknown  wear  parameter  can  be  estimated  well 
with  both  N  =  100  and  N  =  500.  This  is  also  true 
when  the  measurement  set  is  varied,  in  fact,  using  only 
the  open  and  closed  indicators,  prognostics  can  still  be 
performed,  but  at  the  cost  of  a  wider  variance  in  the  pre¬ 
diction  and  slower  convergence.  With  more  particles, 
PRMSE  improves  and  RSD  generally  increases  slightly. 
Convergence  is  somewhat  better  with  fewer  particles  as 
the  filter  tends  to  be  more  aggressive,  whereas  additional 
particles  smooth  the  behavior.  Of  course,  with  too  few 
particles,  convergence  may  not  occur,  therefore  a  rea¬ 
sonably  large  N  must  usually  be  chosen. 

Table  3  shows  the  prediction  performance.  RA  is 


9 


Annual  Conference  of  the  Prognostics  and  Health  Management  Society,  2010 


Table  3:  Comparison  of  Prognostics  Prediction  Methods  for  Spring  Damage 


M 

N 

RA 

RSD 

RUL 

T 

-*■  cpu 

Full 

Sym. 

Skew 

Sph. 

Full 

Sym. 

Skew 

Sph. 

Max 

Sym. 

Skew 

Sph. 

{x,  i,  open,  closed} 

100 

94.59 

94.64 

94.59 

94.56 

15.01 

13.88 

14.96 

14.03 

48.08 

42.98 

23.81 

29.98 

{a;,  i,  open,  closed} 

500 

96.19 

96.20 

96.12 

96.17 

15.21 

14.27 

15.35 

14.42 

61.46 

58.05 

55.85 

58.06 

{x,i} 

100 

93.96 

94.00 

93.88 

94.00 

15.53 

14.73 

15.81 

14.90 

45.49 

38.98 

22.49 

26.02 

{x,i} 

500 

97.16 

97.18 

97.16 

97.22 

15.39 

14.30 

15.46 

14.48 

62.39 

58.97 

56.40 

58.81 

{x} 

100 

93.68 

93.82 

93.65 

93.74 

18.17 

16.58 

17.83 

16.79 

50.89 

42.83 

27.79 

31.37 

{x} 

500 

94.85 

95.02 

94.72 

94.99 

18.25 

16.68 

18.13 

16.93 

67.39 

62.92 

60.81 

63.09 

« 

100 

93.34 

93.34 

93.13 

93.31 

16.65 

15.51 

17.23 

16.22 

47.01 

37.62 

22.19 

25.65 

« 

500 

94.61 

94.77 

94.49 

94.77 

18.32 

16.83 

18.34 

17.14 

65.25 

60.80 

57.84 

60.52 

{open,  closed} 

100 

94.46 

94.74 

94.28 

94.65 

22.84 

20.13 

22.83 

21.48 

48.95 

38.53 

16.73 

36.78 

{open,  closed} 

500 

94.20 

94.58 

94.14 

94.40 

23.62 

20.50 

23.25 

21.55 

74.61 

69.16 

66.20 

68.29 

Table  2:  Damage  Estimation  Performance  for  Spring 
Damage 


M 

N 

PRMSE 

RSDW 

cw 

{x,  i,  open,  closed} 

100 

4.62 

14.81 

32.90 

{x,  i,  open,  closed} 

500 

3.90 

14.74 

32.31 

{x,i} 

100 

5.83 

14.58 

33.79 

{x,i} 

500 

3.87 

14.93 

34.37 

{x} 

100 

3.99 

16.44 

38.13 

{x} 

500 

3.10 

16.83 

35.39 

100 

5.16 

15.74 

40.72 

W 

500 

4.61 

16.45 

34.90 

{open,  closed} 

100 

3.47 

21.47 

42.55 

{open,  closed} 

500 

3.00 

21.89 

42.35 

98 


97 

is 

«  96 
95 


•  Minimal  Skew  Simplex 
- Full 


94-1 - T - T - T - T - . - T - T - T - 1~ 

-1  -0.8  -0.6  -0.4  -0.2  0  0.2  0.4  0.6  0.8 

w0 


15  Jr  i  i 


»—  r  ~w  —  m~  •  -* - - 


estimated  within  similar  error  bounds  as  in  Table  1. 
Again,  the  sigma  point  methods  usually  underapproxi¬ 
mate  RSDflf/i-  With  fewer  particles,  the  gain  in  com¬ 
putational  efficiency  is  smaller,  as  expected,  but  gains 
of  20-40%  are  still  observed  with  N  =  100,  coming 
within  30-90%  of  the  maximum  possible  increase,  and 
the  memory  usage  improves  by  at  least  93%  (i.e.,  100 
particles  compared  to  at  most  7  sigma  points).  Notice 
also  that  for  the  cases  where  RSD/y// /\  is  larger,  such  as 
with  M  =  {open,  closed },  the  savings  are  even  greater, 
i.e.,  the  wider  the  full  particle  distribution,  the  more  of  a 
savings  the  sigma  point  methods  can  offer. 

Overall,  these  results  demonstrate  that  prediction  can 
be  achieved  much  more  efficiently  with  limited  devia¬ 
tions  in  prediction  performance.  The  symmetric  sigma 
points  seemed  to  provide  the  largest  improvement  in  time 
efficiency,  but  underapproximated  the  variance  the  most. 
These  two  effects  are  interrelated.  The  smallest  values 
of  the  wear  parameter  in  the  full  distribution  contribute 
most  to  the  time  cost,  so  sigma  points  concentrated  more 
towards  the  mean  take  less  time  to  simulate.  The  mini¬ 
mal  skew  simplex  sigma  points  came  closest  to  the  vari¬ 
ance  of  the  true  distribution,  but  usually  with  the  smallest 
time  improvement. 

The  biggest  practical  difficulty  in  applying  this 
method  is  the  selection  of  the  free  parameters,  as  this  re¬ 
lates  to  performance.  Using  the  symmetric  sigma  points 


Figure  7:  Prognostics  performance  for  different  values 
of  w o  for  the  minimal  skew  simplex  sigma  points,  for  k 
with  N  =  500  and  M  =  {x,  i,  open,  closed}. 


with  the  suggested  value  of  k  seemed  to  always  work 
well,  requiring  no  further  tuning.  There  is  no  heuristic 
available  in  the  literature  for  the  minimal  skew  simplex 
and  spherical  simplex  sigma  points.  In  order  to  examine 
the  sensitivity  of  the  selected  value  of  the  free  parameter 
on  performance,  we  varied  the  value  of  Wg  for  the  min¬ 
imal  skew  simplex  sigma  points  for  the  case  of  spring 
damage  with  N  =  500  and  M  =  {x,i,  open,  closed}. 
According  to  Table  1,  the  full  distribution  achieves 
RA  =  96.19%  and  RSD RUL  =  15.21  cycles.  Fig.  7 
illustrates  how  these  metrics  vary  over  the  selected  range 
of  wo-  In  general,  a  large  value  of  wg  will  spread  out  the 
sigma  points,  and  therefore  increase  the  predicted  vari¬ 
ance.  For  ivg  £  [—1.0,  0.5],  RA  and  RSD  have  little 
deviation  from  the  desired  values  produced  by  the  full 
distribution,  so  any  value  within  this  range  will  result  in 
an  acceptable  approximation.  But,  for  w o  £  (0.5,  0.9], 
the  approximated  RSD  begins  to  increase  significantly, 
and  RA  decreases  with  it  at  a  much  smaller  rate. 
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7.  CONCLUSIONS 

In  this  paper,  we  developed  a  computationally  efficient 
prediction  scheme  for  model-based  prognostics  based  on 
the  unscented  transform.  The  unscented  transform  al¬ 
lows  the  statistics  of  a  distribution  passed  through  a  non¬ 
linear  transformation  to  be  predicted  using  a  minimal  set 
of  deterministically  selected  samples.  Applying  this  to 
the  prognostics  problem,  we  are  able  to  predict  the  mean 
and  variance  of  the  EOL  accurately,  and  with  improved 
computational  efficiency  and  significantly  reduced  mem¬ 
ory  costs. 

Particle  filtering  approaches  have  become  a  popu¬ 
lar  choice  for  model-based  prognostics  (e.g.,  (Saha  & 
Goebel,  2009;  Abbas,  Ferri,  Orchard,  &  Vachtsevanos, 
2007)).  The  most  significant  disadvantage  is  the  compu¬ 
tational  complexity,  as  usually  a  large  number  of  par¬ 
ticles  are  needed  for  accurate  estimation,  and,  subse¬ 
quently,  prediction.  A  related  approach  to  efficient  pre¬ 
diction  is  described  in  (Orchard,  Kacprzynski,  Goebel, 
Saha,  &  Vachtsevanos,  2008),  however,  in  this  approach, 
random  sampling  with  a  smaller  number  of  particles  is 
advocated.  As  described  in  Section  5.,  a  large  number  of 
randomly  selected  particles  are  needed  to  correctly  ap¬ 
proximate  the  statistics  of  the  prediction  based  on  the  full 
particle  set,  so  a  significant  number  of  particles  would 
still  be  needed.  However,  the  method  described  in  this 
paper  selects  only  the  minimal  number  of  points  neces¬ 
sary  to  capture  those  statistics.  This  number  is  depen¬ 
dent  only  on  the  dimension  of  the  state  space.  This  ap¬ 
proach  is  also  applicable  when  a  technique  other  than 
particle  filters  is  used  for  the  estimation  task,  as  long  as 
the  method  provides  a  state  distribution  which  is  to  be 
propagated  forward  to  EOL.  Note  that  if  the  unscented 
Kalman  filter  is  used,  the  sigma  points  are  already  avail¬ 
able  for  prediction. 

The  unscented  transform  is  not  limited  to  Gaussian 
distributions,  but  the  prediction  method  based  on  it  is 
useful  only  when  the  mean  and  variance  of  the  EOL  dis¬ 
tribution  are  meaningful  statistics.  For  example,  for  a 
multi-modal  distribution,  a  single  mean  and  variance  are 
not  meaningful.  This  could  be  the  case  when  multiple  fu¬ 
ture  input  trajectories  are  considered.  In  this  case,  each 
mode  is  associated  with  one  of  these  trajectories,  and 
each  may  be  defined  by  a  mean  and  variance.  Therefore, 
the  method  could  be  applied  to  each  case  individually  to 
obtain  the  means  and  variances  of  the  different  modes. 

As  part  of  future  work,  it  is  important  to  determine 
strategies  for  selecting  the  free  parameters  of  the  differ¬ 
ent  unscented  transforms,  as  this  is  the  main  hurdle  to 
practical  implementation.  As  discussed  in  Section  6., 
the  suggested  heuristic  for  selecting  k  for  the  symmet¬ 
ric  sigma  points  worked  well.  For  the  remaining  meth¬ 
ods,  selection  of  iijq  may  be  difficult,  although  a  value 
between  —1  and  0.1  worked  well  here.  Further,  scaling 
of  the  sigma  points  can  also  be  performed,  which  intro¬ 
duces  additional  free  parameters  (Julier,  2002).  Bringing 
the  sigma  points  closer  together  will  speed  up  computa¬ 
tion,  but  it  should  be  ensured  that  the  variance  estimate 
remains  accurate.  A  detailed  analysis  over  this  param¬ 
eter  space  is  necessary  to  suggest  useful  heuristics  in 
the  context  of  prognostics.  One  may  then  envision  auto¬ 
matic  methods  to  tune  the  free  parameters  to  achieve  the 
desired  spread  in  sigma  points  to  correctly  approximate 
EOL  mean  and  variance.  Extensions  of  the  unscented 


transform  to  prediction  of  higher-order  moments  such  as 
skew  and  kurtosis  may  also  be  useful  for  prognostics. 
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